Model-based clustering of array CGH data
Abstract
MOTIVATION: Analysis of array comparative genomic hybridization (aCGH) data for recurrent DNA copy number alterations from a cohort of patients can yield distinct sets of molecular signatures or profiles. This can be due to the presence of heterogeneous cancer subtypes within a supposedly homogeneous population.
RESULTS: We propose a novel statistical method for automatically detecting such subtypes or clusters. Our approach is model based: each cluster is defined in terms of a sparse profile, which contains the locations of unusually frequent alterations. The profile is represented as a hidden Markov model. Samples are assigned to clusters based on their similarity to the cluster's profile. We simultaneously infer the cluster assignments and the cluster profiles using an expectation maximization-like algorithm. We show, using a realistic simulation study, that our method is significantly more accurate than standard clustering techniques. We then apply our method to two clinical datasets. In particular, we examine previously reported aCGH data from a cohort of 106 follicular lymphoma patients, and discover clusters that are known to correspond to clinically relevant subgroups. In addition, we examine a cohort of 92 diffuse large B-cell lymphoma patients, and discover previously unreported clusters of biological interest which have inspired follow-up clinical research on an independent cohort.
AVAILABILITY: Software and synthetic datasets are available at http://www.cs.ubc.ca/~sshah/acgh as part of the CNA-HMMer package.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
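The alternating scheme described in the abstract (assign samples to clusters, then re-estimate each cluster's profile) can be illustrated with a much simpler stand-in. The sketch below is not the CNA-HMMer implementation and does not use a hidden Markov model. It assumes that each sample has already been reduced to discrete calls per probe (0 = loss, 1 = neutral, 2 = gain), that a cluster profile is an independent categorical distribution at each probe, and that assignments are hard rather than probabilistic. The function name `cluster_profiles` and all of its parameters are hypothetical.

```python
import numpy as np


def cluster_profiles(calls, n_clusters, n_states=3, n_iter=20, pseudocount=1.0):
    """Hard-assignment, EM-like clustering of discrete copy number calls.

    Simplifying assumption: probes are treated as independent (no HMM).
    calls: (n_samples, n_probes) integer array with values in range(n_states).
    Returns (labels, profiles), where profiles[k, j, s] estimates the
    probability of state s at probe j for cluster k.
    """
    calls = np.asarray(calls)
    n_samples, n_probes = calls.shape

    # Deterministic farthest-first initialisation on Hamming distance.
    seeds = [0]
    dist = (calls != calls[0]).sum(axis=1)
    for _ in range(1, n_clusters):
        nxt = int(dist.argmax())
        seeds.append(nxt)
        dist = np.minimum(dist, (calls != calls[nxt]).sum(axis=1))
    seed_dist = np.stack([(calls != calls[s]).sum(axis=1) for s in seeds], axis=1)
    labels = seed_dist.argmin(axis=1)

    # One-hot encoding of the calls: (n_samples, n_probes, n_states).
    one_hot = np.eye(n_states)[calls]

    for _ in range(n_iter):
        # M-like step: smoothed state frequencies per cluster and probe.
        profiles = np.full((n_clusters, n_probes, n_states), pseudocount)
        for k in range(n_clusters):
            profiles[k] += one_hot[labels == k].sum(axis=0)
        profiles /= profiles.sum(axis=2, keepdims=True)

        # E-like step: log-likelihood of every sample under every profile,
        # followed by a hard assignment to the best-scoring cluster.
        log_lik = np.einsum("ijs,kjs->ik", one_hot, np.log(profiles))
        new_labels = log_lik.argmax(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels

    return labels, profiles
```

Calling `cluster_profiles(calls, n_clusters=2)` on a samples-by-probes matrix returns one label per sample and one profile per cluster. Probes where a cluster's profile puts most of its mass on gain or loss play the role of the "unusually frequent alterations" mentioned above, although the real method additionally models dependence between neighbouring probes.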
Similar resources
Stability-Based Comparison of Class Discovery Methods for DNA Copy Number Profiles
MOTIVATION Array-CGH can be used to determine DNA copy number, imbalances in which are a fundamental factor in the genesis and progression of tumors. The discovery of classes with similar patterns of array-CGH profiles therefore adds to our understanding of cancer and the treatment of patients. Various input data representations for array-CGH, dissimilarity measures between tumor samples and cl...
P-112: PGS-Array-CGH Technique: New Technical Approach to Promotion ART Outcome
Background Chromosomal abnormalities are common in embryos from assisted reproductive technology, ranging from 60% abnormal embryos in women
Spatial clustering of array CGH features in combination with hierarchical multiple testing.
We propose a new approach for clustering DNA features using array CGH data from multiple tumor samples. We distinguish data-collapsing (joining contiguous DNA clones or probes with extremely similar data into regions) from clustering (joining contiguous, correlated regions based on a maximum likelihood principle). The model-based clustering algorithm accounts for the apparent spatial patterns i...
Model-based subspace clustering
We discuss a model-based approach to identifying clusters of objects based on subsets of attributes, so that the attributes that distinguish a cluster from the rest of the population may depend on the cluster being considered. The method is based on a Pólya urn cluster model for multivariate means and variances, resulting in a multivariate Dirichlet process mixture model. This particular model-...
Clustering based on Dirichlet mixtures of attribute ensembles
We propose a model-based approach to identifying clusters of objects based on subsets of attributes, so that the attributes that distinguish a cluster from the rest of the population, called an attribute ensemble, may depend on the cluster being considered. The model is based on a Pólya urn cluster model, which is equivalent to a Dirichlet process mixture of multivariate normal distributions. T...
Applications of array comparative genomic hybridization in cancer and genetic diseases: a review article
Journal title:
- Bioinformatics
Volume 25, Issue
Pages -
Publication date: 2009